Goto

Collaborating Authors

 path constraint


Worst-Case Symbolic Constraints Analysis and Generalisation with Large Language Models

Koh, Daniel, Noller, Yannic, Pasareanu, Corina S., Skapars, Adrians, Sun, Youcheng

arXiv.org Artificial Intelligence

Large language models (LLMs) have demonstrated strong performance on coding tasks such as generation, completion and repair, but their ability to handle complex symbolic reasoning over code still remains underexplored. We introduce the task of worst-case symbolic constraints analysis, which requires inferring the symbolic constraints that characterise worst-case program executions; these constraints can be solved to obtain inputs that expose performance bottlenecks or denial-of-service vulnerabilities in software systems. We show that even state-of-the-art LLMs (e.g., GPT-5) struggle when applied directly on this task. To address this challenge, we propose WARP, an innovative neurosymbolic approach that computes worst-case constraints on smaller concrete input sizes using existing program analysis tools, and then leverages LLMs to generalise these constraints to larger input sizes. Concretely, WARP comprises: (1) an incremental strategy for LLM-based worst-case reasoning, (2) a solver-aligned neurosymbolic framework that integrates reinforcement learning with SMT (Satisfiability Modulo Theories) solving, and (3) a curated dataset of symbolic constraints. Experimental results show that WARP consistently improves performance on worst-case constraint reasoning. Leveraging the curated constraint dataset, we use reinforcement learning to fine-tune a model, WARP-1.0-3B, which significantly outperforms size-matched and even larger baselines. These results demonstrate that incremental constraint reasoning enhances LLMs' ability to handle symbolic reasoning and highlight the potential for deeper integration between neural learning and formal methods in rigorous program analysis.


HybridGen: VLM-Guided Hybrid Planning for Scalable Data Generation of Imitation Learning

Wang, Wensheng, Tan, Ning

arXiv.org Artificial Intelligence

The acquisition of large-scale and diverse demonstration data are essential for improving robotic imitation learning generalization. However, generating such data for complex manipulations is challenging in real-world settings. We introduce HybridGen, an automated framework that integrates Vision-Language Model (VLM) and hybrid planning. HybridGen uses a two-stage pipeline: first, VLM to parse expert demonstrations, decomposing tasks into expert-dependent (object-centric pose transformations for precise control) and plannable segments (synthesizing diverse trajectories via path planning); second, pose transformations substantially expand the first-stage data. Crucially, HybridGen generates a large volume of training data without requiring specific data formats, making it broadly applicable to a wide range of imitation learning algorithms, a characteristic which we also demonstrate empirically across multiple algorithms. Evaluations across seven tasks and their variants demonstrate that agents trained with HybridGen achieve substantial performance and generalization gains, averaging a 5% improvement over state-of-the-art methods. Notably, in the most challenging task variants, HybridGen achieves significant improvement, reaching a 59.7% average success rate, significantly outperforming Mimicgen's 49.5%. These results demonstrating its effectiveness and practicality.


Autonomous Horizon-based Asteroid Navigation With Observability-constrained Maneuvers

Anibha, Aditya Arjun, Oguri, Kenshiro

arXiv.org Artificial Intelligence

Asteroid exploration is a pertinent challenge due to the varying complexity of their dynamical environments, shape and communication delays due to distance. Thus, autonomous navigation methods are continually being developed and improved in current research to enable their safe exploration. These methods often involve using horizon-based Optical Navigation (OpNav) to determine the spacecraft's location, which is reliant on the visibility of the horizon. It is critical to ensure the reliability of this measurement such that the spacecraft may maintain an accurate state estimate throughout its mission. This paper presents an algorithm that generates control maneuvers for spacecraft to follow trajectories that allow continuously usable optical measurements to maintain system observability for safe navigation. This algorithm improves upon existing asteroid navigation capabilities by allowing the safe and robust autonomous targeting of various trajectories and orbits at a wide range of distances within optical measurement range. It is adaptable to different asteroid scenarios. Overall, the approach develops an all-encompassing system that simulates the asteroid dynamics, synthetic image generation, edge detection, horizon-based OpNav, filtering and observability-enhancing control.


ReKep: Spatio-Temporal Reasoning of Relational Keypoint Constraints for Robotic Manipulation

Huang, Wenlong, Wang, Chen, Li, Yunzhu, Zhang, Ruohan, Fei-Fei, Li

arXiv.org Artificial Intelligence

Representing robotic manipulation tasks as constraints that associate the robot and the environment is a promising way to encode desired robot behaviors. However, it remains unclear how to formulate the constraints such that they are 1) versatile to diverse tasks, 2) free of manual labeling, and 3) optimizable by off-the-shelf solvers to produce robot actions in real-time. In this work, we introduce Relational Keypoint Constraints (ReKep), a visually-grounded representation for constraints in robotic manipulation. Specifically, ReKep is expressed as Python functions mapping a set of 3D keypoints in the environment to a numerical cost. We demonstrate that by representing a manipulation task as a sequence of Relational Keypoint Constraints, we can employ a hierarchical optimization procedure to solve for robot actions (represented by a sequence of end-effector poses in SE(3)) with a perception-action loop at a real-time frequency. Furthermore, in order to circumvent the need for manual specification of ReKep for each new task, we devise an automated procedure that leverages large vision models and vision-language models to produce ReKep from free-form language instructions and RGB-D observations. We present system implementations on a wheeled single-arm platform and a stationary dual-arm platform that can perform a large variety of manipulation tasks, featuring multi-stage, in-the-wild, bimanual, and reactive behaviors, all without task-specific data or environment models. Website at https://rekep-robot.github.io.


DLLens: Testing Deep Learning Libraries via LLM-aided Synthesis

Li, Meiziniu, Li, Dongze, Liu, Jianmeng, Cao, Jialun, Tian, Yongqiang, Cheung, Shing-Chi

arXiv.org Artificial Intelligence

Testing is a major approach to ensuring the quality of deep learning (DL) libraries. Existing testing techniques commonly adopt differential testing to relieve the need for test oracle construction. However, these techniques are limited in finding implementations that offer the same functionality and generating diverse test inputs for differential testing. This paper introduces DLLens, a novel differential testing technique for DL library testing. Our insight is that APIs in different DL libraries are commonly designed to accomplish various computations for the same set of published DL algorithms. Although the mapping of these APIs is not often one-to-one, we observe that their computations can be mutually simulated after proper composition and adaptation. The use of these simulation counterparts facilitates differential testing for the detection of functional DL library bugs. Leveraging the insight, we propose DLLens as a novel mechanism that utilizes a large language model (LLM) to synthesize valid counterparts of DL library APIs. To generate diverse test inputs, DLLens incorporates a static analysis method aided by LLM to extract path constraints from all execution paths in each API and its counterpart's implementations. These path constraints are then used to guide the generation of diverse test inputs. We evaluate DLLens on two popular DL libraries, TensorFlow and PyTorch. Our evaluation shows that DLLens can synthesize counterparts for more than twice as many APIs found by state-of-the-art techniques on these libraries. Moreover, DLLens can extract 26.7% more constraints and detect 2.5 times as many bugs as state-of-the-art techniques. DLLens has successfully found 56 bugs in recent TensorFlow and PyTorch libraries. Among them, 41 are previously unknown, 39 of which have been confirmed by developers after reporting, and 19 of those confirmed bugs have been fixed by developers.


Code-Aware Prompting: A study of Coverage Guided Test Generation in Regression Setting using LLM

Ryan, Gabriel, Jain, Siddhartha, Shang, Mingyue, Wang, Shiqi, Ma, Xiaofei, Ramanathan, Murali Krishna, Ray, Baishakhi

arXiv.org Artificial Intelligence

Testing plays a pivotal role in ensuring software quality, yet conventional Search Based Software Testing (SBST) methods often struggle with complex software units, achieving suboptimal test coverage. Recent work using large language models (LLMs) for test generation have focused on improving generation quality through optimizing the test generation context and correcting errors in model outputs, but use fixed prompting strategies that prompt the model to generate tests without additional guidance. As a result LLM-generated test suites still suffer from low coverage. In this paper, we present SymPrompt, a code-aware prompting strategy for LLMs in test generation. SymPrompt's approach is based on recent work that demonstrates LLMs can solve more complex logical problems when prompted to reason about the problem in a multi-step fashion. We apply this methodology to test generation by deconstructing the testsuite generation process into a multi-stage sequence, each of which is driven by a specific prompt aligned with the execution paths of the method under test, and exposing relevant type and dependency focal context to the model. Our approach enables pretrained LLMs to generate more complete test cases without any additional training. We implement SymPrompt using the TreeSitter parsing framework and evaluate on a benchmark challenging methods from open source Python projects. SymPrompt enhances correct test generations by a factor of 5 and bolsters relative coverage by 26% for CodeGen2. Notably, when applied to GPT-4, symbolic path prompts improve coverage by over 2x compared to baseline prompting strategies.


Multi-Shooting Differential Dynamic Programming for Hybrid Systems using Analytical Derivatives

Singh, Shubham, Russell, Ryan P., Wensing, Patrick M.

arXiv.org Artificial Intelligence

Differential Dynamic Programming (DDP) is a popular technique used to generate motion for dynamic-legged robots in the recent past. However, in most cases, only the first-order partial derivatives of the underlying dynamics are used, resulting in the iLQR approach. Neglecting the second-order terms often slows down the convergence rate compared to full DDP. Multi-Shooting is another popular technique to improve robustness, especially if the dynamics are highly non-linear. In this work, we consider Multi-Shooting DDP for trajectory optimization of a bounding gait for a simplified quadruped model. As the main contribution, we develop Second-Order analytical partial derivatives of the rigid-body contact dynamics, extending our previous results for fixed/floating base models with multi-DoF joints. Finally, we show the benefits of a novel Quasi-Newton method for approximating second-order derivatives of the dynamics, leading to order-of-magnitude speedups in the convergence compared to the full DDP method.


Horswill

AAAI Conferences

We examine the use of constraint propagation for populating indoor game levels with enemies and other objects. We introduce a notion of path constraints, which bound some function over the possible paths a player might take, and show how to efficiently place objects while guaranteeing path constraints. This allows the system to guarantee that power-ups are balanced to the number of enemies occurring in the level, that they're placed early enough to be useful, that keys are not hidden behind the doors they are intended to unlock, and so on. We describe a constraint solver based on interval methods that allows natural processing of numeric constraints and show that it is efficient enough to be used even on very low-end platforms.


Automated Test Generation to Detect Individual Discrimination in AI Models

Agarwal, Aniya, Lohia, Pranay, Nagar, Seema, Dey, Kuntal, Saha, Diptikalyan

arXiv.org Artificial Intelligence

Dependability on AI models is of utmost importance to ensure full acceptance of the AI systems. One of the key aspects of the dependable AI system is to ensure that all its decisions are fair and not biased towards any individual. In this paper, we address the problem of detecting whether a model has an individual discrimination. Such a discrimination exists when two individuals who differ only in the values of their protected attributes (such as, gender/race) while the values of their non-protected ones are exactly the same, get different decisions. Measuring individual discrimination requires an exhaustive testing, which is infeasible for a non-trivial system. In this paper, we present an automated technique to generate test inputs, which is geared towards finding individual discrimination. Our technique combines the well-known technique called symbolic execution along with the local explainability for generation of effective test cases. Our experimental results clearly demonstrate that our technique produces 3.72 times more successful test cases than the existing state-of-the-art across all our chosen benchmarks.


Fast Procedural Level Population with Playability Constraints

Horswill, Ian D. (Northwestern University) | Foged, Leif (Northwestern University)

AAAI Conferences

We examine the use of constraint propagation for populating indoor game levels with enemies and other objects.  We introduce a notion of path constraints , which bound some function over the possible paths a player might take, and show how to efficiently place objects while guaranteeing path constraints.  This allows the system to guarantee that power-ups are balanced to the number of enemies occurring in the level, that they’re placed early enough to be useful, that keys are not hidden behind the doors they are intended to unlock, and so on. We describe a constraint solver based on interval methods that allows natural processing of numeric constraints and show that it is efficient enough to be used even on very low-end platforms.